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The problem of vertex coloring in random graphs is studied using methods of statistical physics 
' and probability. Our analytical results are compared to those obtained by exact enumeration and 

I Monte-Carlo simulations. We critically discuss the merits and shortcomings of the various methods, 

, and interpret the results obtained. We present an exact analytical expression for the 2-coloring 

problem as well as general replica symmetric approximated solutions for the thermodynamics of the 
graph coloring problem with p colors and i('-body edges. 
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I. INTRODUCTION 



O ' Methods of statistical physics have recently been applied to a variety of complex optimization problems in a broad 
, range of areas, from computational complexity ^ to the study of error correcting codes and cryptography ^. 
Graph coloring is one of the basic Non-deterministically Polynomial (NP) problems. The task is to assign one of 
p-colors to each node, in a randomly connected set of vertices, such that no edge will have the same colors assigned to 
^ ' both ends. The feasibility of finding such a solution clearly depends on the level and nature of connectivity in the graph 
c/2 I and the number of colors. The very existence of a solution is in the class of NP-complete problems Q. An extension 
of the problem to the case of hyper-edges comprising more than two vertices is also of practical significance [Q . 

Recent success in the application of statistical physics techniques to computational complexity problems, naturally 
lead to the belief that they may be applied to a wide range of computational complexity tasks, among them is the 
graph coloring problem. 

^ [ In this paper we map the graph coloring problem, with p-colors, onto the anti- ferromagnetic p-spin Potts model 
Q ■ this facilitates the use of established methods of statistical physics for gaining insight into the dependence of graph 
O ' colorability on the nature and level of its connectivity, and the phase transitions that take place. The suggested 
framework comes with its own limitations; we critically discuss what can, and cannot be calculated via the methods 
of statistical mechanics. 

^ , The statistical physics approach is based on the introduction of a Hamiltonian or cost-function, and the calculation 
CO of the typical free energy in the large system limit. From the free energy one can obtain the typical ground state 
, energy, which in turn allows one to make predictions on the graph colorability. A non-zero ground state energy 
\l indicates that, under the given conditions, random graphs are typically not colorable. Our theoretical results are 
restricted to the replica symmetric (RS) approximation (see and are, for the 2-color problem (which is 

■ solvable in linear time) in perfect agreement with those obtained by numerical methods; for the 3-color problem the 

■ results are only in qualitative agreement with those obtained by numerical methods. The theoretical results can be 
"j--; , systematically improved by using replica symmetry breaking (RSB) approximations, although our current results do 

not provide a direct indication for a breakdown of the RS approximation. 
^ , Apart from determining merely the colorability of the graph, the ground state energy also tells us what is the typical 
I ■ minimal fraction of unsatisfied edges when the graph is non-colorable. Furthermore, the ground state (residual) entropy 
■ O [ gives us information about the number of different coloring schemes that share the minimum number of unsatisfied 
^ . edges. 

The suggested framework covers a range of possible variations of the original problem. However, only a limited 
number of them can be studied in a single paper; we therefore restrict this study to regular p = 2 and p = 3 color 
problems on random graphs with 2-vertex edges (i.e. with 2-body interactions in the statistical physics terminology). 
In this context, regular stands for the fact that all edges connect the same number of vertices and impose the same 
5J] ' color constraint on the vertices they connect, and that all vertices have the same available color set. Possible variations 
[ include many-i^ vertex edges (iiT-body interactions); mixtures of edges with different K values and/or with different 
local constraints imposed on the colors of the vertices involved; constraints on the overall frequencies of vertices of a 
certain color; mixtures of vertices with different available color sets; other probability distributions of the number of 
edges per vertex, etc. 

Our results, in agreement with results obtained elsewhere seem to indicate, that for p > 3 there is a 1st order 
transition for the colorability as a function of the average graph connectivity, from probability 1 to 0, at some critical 
average connectivity. This implies that in these cases a vanishing ground state energy implies that the graph is 
p-colorable, while a non-zero ground state energy indicates that the graph is typically not p-colorable. 

Contrasting results obtained from the theoretical framework with numerical studies in the case of p = 2 exposes 
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inherent limitations of the statistical physics based analysis. Using a completely different approach, we also obtain an 
exact expression for the probability that large random graphs with 2-vertex edges are 2-colorable, finding a 2nd order 
transition for the colorability as a function of the graph's average connectivity, from non-zero to zero probability. 
This result shows that in general a zero ground state energy does not automatically imply that a graph is typically 
colorable. 

The paper is organized as follows: In section II we define the problem and introduce the notations used, while in 
section [II we introduce the statistical physics framework. Section IV introduces the results obtained from the analysis 



as well as numerical results obtained by exact enumeration and Monte-Carlo simulations. The case of 2-colorability 
is studied in section M; discussion and conclusions are presented in section VI . 



II. GRAPH COLORING - DEFINITIONS AND NOTATION 



In a general set-up, we consider regular random graphs Q{Ny,K,X) consisting of Ny vertices, connected to each 
other by (hyper) edges. Each (hyper) edge connects exactly K distinct vertices. The connectivity is then described 
by the tensor j^}, the elements of which are 1 is there is an (hyper) edge connecting the vertices {ji--jK}, and 

otherwise. Note that the total number of possible (hyper) edges in the graph is given by Np^/g = (^^^ ' while 

the total number of possible (hyper) edges a given vertex j may be involved in, is given by Npf,/y = ^ ^ ^ . The 

overall connectivity of the graph Q(Ny,K^X) is described by the parameter A, which gives the average number of 
edges each vertex is involved in. Hence, for large graphs (i.e. Ny oo), the fraction of the total number of edges 
and the total number of vertices Ny, is typically given by 

+ a, 

In a random graph, this is obtained by considering all (i.e. Np^^g) possible if-tuples {ji--jK} of vertices, and by 
assigning 

1 with probability 



T^{h-3K} \ with probability P„e = 1 - ^"^^ 



In the large system limit (i.e. Ny — s- oo), the number of edges per vertex (rie) is then Poisson distributed: 

Pine ^k)=( \ f-^) ' (l ' ^ exp(- A), fc = 0, 1, 2, .., ex. . (3) 



N I \ N I A-l 

The most studied case is that oi K ~ 2, in which one considers conventional edges (or 2-body interactions); graphs 
with if > 3 are also considered in other contexts, for instance, the assignment of examination rooms to classes 0, in 
which case one generally speaks of K hyper edges (or X-body interactions). Although we will derive expressions for 
general K, in this paper we will limit ourselves to the analysis of random graphs with K = 2. 

Now we assume that each vertex j can take a color Cj out of a set {/ij.i, ..,/ij.pj} of Pj colors, its color set. A 
coloring problem on a graph is determined by the constraint(s) on the colors of vertices connected by a(n) (hyper) 
edge. For instance, one can demand that none of the colors Cj of the vertices connected by an edge are the same, or 
that the colors Cj of the vertices connected by an edge are not all the same (note that both constraints are identical 
for K = 2). Although in principle, one can consider scenarios where the color set may differ from vertex to vertex, 
and where the color constraints may differ from edge to edge, in the present paper we restrict ourselves to the case 
where all vertices have the same color set {^i, = {1, of p colors, and where each edge imposes the same 

color constraint on the vertices it connects. The actual color of a vertex j is indicated by Cj G {1, ..,p\, and we denote 
a coloring of the entire graph by c = {ci, .., cat^}. 

In this context our goal is to determine the probability that a randomly generated graph with average connectivity 
A, and a given color set and color constraints, is colorable. 

Note that the K = 2 case with p available colors, is exactly the anti-ferromagnetic p-spin Potts model while 
the p = 2 case is the anti- ferromagnetic Ising model (see []lo| , pf). The only randomness present in the model, is the 
random graph connectivity. 
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III. REPLICA CALCULATION 
A. General Scenario 

We now present the statistical physics formulation of the graph coloring problem. To map this problem onto a 
statistical physics framework, we introduce a Hamiltonian or cost function for given coloring c and connectivity V: 

H(c,I?)^5]P0. (4) 

Ok 

where we have introduced the following short-hand notation for the X-tuples to keep our notation concise: 

Ok ^ {Ji-Jk} ■ (5) 

Furthermore, X{)^(c) is if the edge color constraints are satisfied and 1 otherwise, such that 7i(c) counts the number 
of unsatisfied edges. We focus on the case where colors of nodes sharing an edge should not all be the same; X()^(c) 
is then given by 

p K 
^1=1 k=l 

such that 

exp {-13 xoM = fi [1 - A Ho J = 1 - A , (7) 

/I — 1 /x=l 

where A = (1 — e^^). In addition we could put constraints on the total fraction's of edges of color /i: 

N 

= ^/m. (note that ^ = 1) • (8) 

The central quantity from which all other relevant physical quantities can be derived, is the free energy. This can be 
obtained from the partition function (with the constraints on f): 

Z{f, V) = Tr exp [ -f3 ^ (c) | \{ 5{Y, A^c, - Nf^) . (9) 

\ Ok / A" J = l 

The free energy per degree of freedom, is then obtained from JF(/,P) = — log[-E(/, I?)]. It is very hard and not 

very useful to calculate J-'{f,'D) for any specific choice of connectivity V. Therefore, we calculate the expectation 
(average) value of the free energy over the ensemble of all allowed realizations of the connectivity. The average over 
all tensors V with K non-zero elements per row, and Lj per column j is given by 

{m)v ^ ^ / ^ T7 • (10) 

Quantities of the type Q(c) — {Qy{c))y, with Qy{c) = In [Zy (c)] and Zy{c) = Tr^ f{x, y), are very common in the 
statistical physics of disordered systems. We distinguish between the (quenched) disorder y (the connectivity T) in 
our case) and the microscopic (thermal) variables x (the coloring c in our case) . Some macroscopic order parameters 
c(x,y) (the in our case) may be fixed to specific values and may depend on both y and x. Although we will not 
prove this here, such a quantity is generally believed to be self-averaging in the large system limit, i.e., obeying a 
probability distribution P{Qy{c)) = S{Qy{c) — Q(c))). The direct calculation of Q(c) is known as a quenched average 
over the disorder, but is typically hard to carry out, and requires using the replica method p^ . The replica method 
makes use of the identity (InZ) = ( limn_>o[-Z" — l]/n ), by calculating averages over a product of partition function 
replicas. Employing assumptions about replica symmetries and analytically continuing the variable n to zero, one 
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obtains solutions which enable one to determine the state of the system. We now present only the definitions and 
final expressions for the relevant physical quantities as obtained by the replica calculation. For the technical details 
we refer to appendix 

The order parameters that naturally occur in this calculation are 



N 



0,1,. 



and their definition is enforced by the introduction of the corresponding Lagrange multipliers q^^^y 
introduced short hand notations for the m-tuples of replica indices and their corresponding colors: 



(11) 



Here we have 



= 1,. 



{fi}ni = {fii\ £ = 1, ..,m}. 



(12) 



Note the difference in notation for the replica indices < . >, which all have to be different, and for the colors {.} for 
which multiple occurrence of the same color is allowed. 

Since all replicas are subject to the same disorder the corresponding variables, depending on just one replica index, 
must be equivalent (index independent): //J = /;^, = Qf^ and — q^. To proceed with the calculation, one 

needs to assume a certain order parameter symmetry for and their conjugates ^j^i" , for m > 1. The simplest 

ansatz is that all replica m-tuples (m — 2, ..,n) with the same color set {/i}m are equivalent. This ansatz is called 
the replica symmetric ansatz (RS). In RS the order parameters ^j^}" , ^j/l}" depend only on the color multiplicities 

"^Ai = Sfci ^M.Mf appearing in the m-tuple {/x}ni (i.e. gj^^-^"^ = g^, and q'^lf'^ = qm, where rn = {m^ \ ^J■ = ^, ■■,?})■ 
Note that for general positive integer n there may be m-tuples of any size up to n, therefore to^ can take the 
values 0,1,.., n under the constraint "^m ^ ^ ■ To facilitate the analytic continuation to non- integer n, it is 
now technically advantageous to write the discrete set of order parameters {qm, q^} as the moments of p-variable 
probability distributions on the interval [0, 1]^: 



qrn= qoJ'{dxn{x)} n)^=i(a;^)"^ 
q,n^ qoJ'{dxTr{x)} n^=i(-^M)'' 



(13) 



where J' dy... = Jq dy /i} ■ ■ ■ <5(X]^=iyp ^ 1)- The variables can be interpreted as the (cavity) probabilities 

that a vertex takes the colors fi e {1, and 7r(x) is their joint probability distribution. The constraint y^ = ^ 
expresses the fact that the total probability is 1 . Using the ansatz (|3|) , solving the saddle point equations with respect 
to go and qo, and taking the limit n ^ 0, we obtain the quenched free energy per edge ^e{f) for given values /: 



^ E] //j/m 



K 

KQi —Q2 — 



(14) 



taken in the extremum with respect to (/, tt, tt), where 

// p 
{dxdx 7t{x) tt{x)} log(l - ^ Xf_iXf_i) 

„/ K p K 

Q2 ^ Y\_{dxk Trixk)}\og{l - A^Yl^k, p.) 

•' k=l /i=l k=l 

^ E^(^) /'n{'^i/*(ii)}log(Eexp(^)n(l-^/,p)) 

L=0 •' 1=1 \p=l 1=1 ) 



The internal energy and entropy per edge are then given by 



df3 



I K 



exp(-/3) I lk=i ^fc,M 

=1 ^k,tj,) 



Ylidxk Tr(xk)} ^^-^ — f^— 



5e = f3{U, - T,) 



(15) 



(16) 
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Note that it is convenient to consider the energy per edge (Vie), and entropy per vertex {Sy = x^e). In this way, 
Ue is just the fraction of unsatisfied edges (i.e. < We < 1), while Sy is the entropy per degree of freedom (i.e. 
< 5, < log(p)). 

The saddle point equations are obtained by variation with respect to /, tt and tt (under the constraint that tt and 
TT are normalized) respectively, yielding: 



ffj. 
Tr{x) 
Tr{x) 



h J ti ELiexp(/.)nti(i-i^.) 

„, K-l / K-l \ 



L 



L = l 



J, _ exp(/^)n^!(l - xi^^) 
E.exp(/.)nti'( 



(17) 
(18) 
(19) 



From ( p7| ) and (19) we see that the normalizations ffj. — ^ and Xfj_ — 1 are automatically satisfied. 

Note that for the 2-color problem {p = 2) one can invoke an Ising spin representation for the colors, e.g. by 
mapping the color 1 onto spin +1 and color 2 onto spin— 1. Then, using the fact that X2 = 1 — xi, and defining 
TO = 1 — 2x2 (G [—1, 1]), one obtains a single 1-variable probability distribution 7r() for the cavity magnetization (m) 
of the vertices (spins): 



7f(m) = TT^ 



1+TO 1 — TO 



(20) 



We also note that in the absence of overall color constraints (i.e. = 0), a paramagnetic solution of the saddle point 
equations (18pj|) always exists: 



7rpm(^) ^ S [x 



1)1 



{KX-K-X) 
A 



log(p)-log(p^"^-A) 



exp(-/3) 
(pif-i_A)' 



fu 



p 



(21) 
(22) 



Finally, one should note that the expressions (|14[-|19[) are valid for any distribution of the number of edges per vertex 
P{L), although in this paper we only investigate the case where P{L) is a Poisson distribution. 



B. Two-body interactions, no color constraints 



We now derive explicit expressions for the special cases that we analyze in more detail later on: 2-body edges, 
K = 2, and no constraint on the overall color frequencies — 0, V/i). From (^8|) we obtain the relation 



dx Tf{x) g[x) = dx tt{x) g{Ax), 



(23) 



such that the free energy per edge can be written in terms of the p-dimensional probability distribution Tr{x) alone: 



Gi = I X{{dxu^{xu)} log h - A^j^xfe,^ J 

k=l \ A" k=l ) 

g, ^ 5]p(L) /"n{df, ^(fO}iogfen(i-Ax,,o) 

L •'1=1 \ /i 1=1 J 



The saddle point equation (09^ now becomes 



n{x) = E^wy /'nW; 4^0111'^ 

L = l 1=1 M 



(24) 
(25) 

(26) 
(27) 
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Since the main question we want to investigate is the colorabihty of the graph, we are specifically interested in the 
ground state energy. We therefore take the low temperature limit (i.e. (3 — > oo), where a finite contribution to the 
energy only exists when 1 — Xk^fi = ek,fj, = 0(exp(— /?)) for the same color fi for both k — 1,2; i.e. when two connected 
vertices are forced to have the same color. Then the integrand of (^6|) becomes to leading order, 



exp(-/3)E'=inLi^fc,M _ (1-^) 1 



(l + AXexp(/3)) (l + exp(/3)X) 

with 



(28) 



X = ei,^ + e2,f, - £i,tL£2,tL - ^ xi,^X2,u ^ 0(exp(-/3)) . (29) 

However, the limit /3 — oo is not easily taken analytically for the fixed point equation (|27|). As we show in appendix^, 
even in this limit, the extremizing distribution Tr(x) is non-trivial, and we have not found a way to obtain it analytically. 
We therefore solve equation (|2^) numerically to obtain the equilibrium distribution 7r{x) which is in turn used to obtain 
U and S. 

The various integrations in the saddle point equations and the resulting physical quantities are obtained by the 
Monte-Carlo method. The distribution Tiix) is obtained as the (p-dimensional) histogram of a large population of size 
Np of p-dimensional points {xi | z = 1, ..,Np}. All results presented in this paper have been obtained using Np = 10^. 
The fixed point equation ( p7| ) can then be solved by randomly updating (i.e. replacing) one of the Xi — *■ x'^. The 
update of x[ is carried out by, first, randomly picking a value L with probability P{L)^, then randomly picking L—1 
Xii's, and finally using the r.h.s. of the arguments of the (5-function in (^) to calculate the resulting components of 
x'^. This process is repeated until the histogram reaches a steady state. Once this histogram is obtained, it can be 
used to calculate the various physical quantities in similar fashion. 

Note that in order to reach a sufficient numerical precision in the low temperature limit for the components of the 
Xi, we either save x^^^ if Xi^^ < 0.5 or e^^^ = 1 — x^.^ if Xi^^ > 0.5. This avoids precision loss, e.g. in calculating 
(1 — Axi,p), when Xi,^ is very close to 1. Similar steps are taken to keep sufficient numerical precision for the r.h.s of 
the saddle point equation (p7|). 

Furthermore, it should be noted that often a very large number of iterations is needed (up to lO'^A^p) before the 
distribution becomes stationary. This, in combination with the finite population size Np = 10^, and the inherent 
randomness in the Monte-Carlo integrations, puts a limit on the achievable numerical precision of our results. 



IV. RESULTS 



We now turn to the results of the numerical evaluation of the RS expressions. 

First, it should be noted that the residual entropy Sq{X) per vertex (i.e. the logarithm of the number of colorings 
of the ground state) does not vanish for any finite A. For the 2-color problem 

5o(A) > log(2) > P{ne = 0, A) log(2) = 5,(A) > , (30) 

where Ndd^) is the number of disconnected clusters, and where P{ne = 0, A) > is the fraction of completely isolated 
vertices at given connectivity A. For each of these clusters, one can pick a single representative vertex and give it 
2 different colors; the color of all the other vertices in the cluster is then uniquely determined when the graph is 2- 
colorable. In the case of non 2-colorable cluster, there is at least 1 (and possibly more) way of coloring the remaining 
vertices such that the number of unsatisfied edges in the cluster is minimal. 
For the p-color problem 

5o(A) > Pi^e = k. A) log(p - fc) = 5i(A) > , (31) 

fc=0 

where P{ne = fc. A) > is the fraction of vertices connected by k edges at given connectivity A. A vertex connected 
to k other vertices, can at least pick between p — k colors (and more if some of the vertices it is connected to have 
the same color) whether the graph is p-colorable or not. In case the graph is not p-colorable, there is at least 1 (and 
possibly more) choices of coloring the vertices such that the number of unsatisfied edges in the graph is minimal. 

The ground state energy £'o(A) per edge can then be used as an indicator for the colorabihty of the graphs. Since 
we use the saddle point method, there may be 0{l/\/T^) fluctuations of the internal energy per edge around the 
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saddle point value. If Eo{X) > 0, this clearly precludes colorability, while for Eo{X) = the colorability may depend 
on the fluctuations. 

Note that in the absence of overall constraints on the color frequencies, the solutions always exhibit a complete color 
symmetry, as expected. In other words, the distribution Tr(x) is symmetric under permutations of the components of 
X (up to numerical precision), and the marginal distribution for each of the colors is identical: 

T^f,{x^,)= / J]^ do;^ 7r(x), -> 7r^(x^) = 7r(a;) ,V ^ (32) 

•.'0 ^ 



1. 2-color graphs 



For the 2-color problem, the results are as follows (see Figs.0 and ||): 

• For A < 1, we only find the paramagnetic solution at all temperatures and the corresponding ground state 
energy Eq{\) = 0. 

• For A > 1, from a certain (inverse) temperature Tp{X) (/3p(A)) on- wards, the paramagnetic solution coexists 
with a non-trivial solution, which can be identified as the physical one (at least in the RS approximation) by 
the fact that this solution continues to obey inequality ( ^ ) for all values of A that we have examined, while the 
continuation of the para-magnetic solution violates it. We have a positive ground state energy Eq{X) > 0, and 
in perfect agreement with the numerical experiments, this predicts ^'c(A) = for A > Ac = 1. 

The behavior of the ground state energy and entropy is presented in Fig.^ while the phase diagram and the explicit 
distribution obtained above A > 1 is presented in FigJ^. 

^From (Uq), we see that the internal energy is always positive. Furthermore, the numerical analysis indicates that 



also the entropy and the specific heat Cy = = are always non-negative, and inequality (^0|) is always 

satisfied. This implies that all quantities behave as in a proper physical system, not giving any direct indication that 
the RS-ansatz might be inaccurate. 
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FIG. 1: On the left: the ground state energy Eq{X) for p — 2. Up to A = 1 (•), £'o(A) = 0. The paramagnetic ground state 
energy _Ep,pm is always 0. On the right: the ground state entropy So (A) (full line) for p = 2, compared to its lower bound 
Si{X) ( pOj ) (dashed line), and the paramagnetic ground state entropy 5'pm(A) (dotted line). Up to A = 1 (•), So and Spm 
coincide. 



2. 3-color graphs 

For the 3-color problem, the results are as follows (see Figs.|| and ^): 

• For A ^ 4, we only find the paramagnetic solution at all temperatures, and the corresponding ground state 
energy £'o(A) = 0. 
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FIG. 2: On the left: the phase diagram (A,r), and the transition from the paramagnetic to the non-paramagnetic RS state. 
The phase transition is 2nd order in -nix). At zero temperature, _Eo = for A < 1 (x) and i5o > for A > 1 (from x on-wards). 
On the right: the stationary distribution t[{x) for p = 2, A = 2 (> Ac), P — 15. We note the peaks and non trivial distribution 
at a; ~ and a; ~ 1, indicating that many vertices are forced (not) to take a specific color. For A < Ac these peaks are absent. 
We also note the distinct peaks at a: ~ 1/2, 1/3 and other rational values. The symmetry around x = 0.5 is specific for p = 2. 



• For 4 ^ A ^ 5.1, from a certain temperature Tpm{X) on-wards, the paramagnetic solution coexists with a non- 
trivial solution, which can be identified as the physical one by comparing the free energies. The ground state 
energy Eq{X) remains 0. 

• For 5.1 ^ A, from a certain temperature Tpm{X) on-wards, the paramagnetic solution coexists with a non-trivial 
solution with a positive ground state energy Eo{X) > 0, which can be identified as the physical one by by the 
fact that this solution continues to obey inequality ( |3l| ) for all values of A that we have examined, while the 
continuation of the para-magnetic solution violates it. 

The behavior of the ground state energy and entropy is presented in Fig.^; explicit distributions obtained for various 
A values are presented in Fig.^, while the phase diagram is presented in Fig.^. 

As we will see, the numerical experiments predict that Pc(A) = 1 for A < Ac — 4.7, and that Pc(A) = for 
A > Ac — 4.7. Although the RS analysis results do not contradict the numerical ones, they are unable to identify 
Ac — 4.7 as the critical colorability value. This is reminiscent of the RS results in the K-SAT problem [Q. 

In our case, however, from (p^), we see that the internal energy is always non- negative. In addition , the numerical 
analysis shows that both entropy and specific heat Cy = = ^fy are always non-negative, and inequality ( ^ ) is 
always satisfied. This implies that all quantities behave as in a proper physical system, thus giving no direct indication 
that the RS-ansatz is wrong. 
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FIG. 3: On the left: the ground state energy Eo{\) for p = 3. Up to A ~ 5.1 (•), Eo = 0. The paramagnetic ground state 
energy i5o,pm is always 0. On the right: the ground state entropy S'o(A) (full line) for p = 3, compared to its lower bound ^((A) 
(dashed line), and the paramagnetic ground state entropy 5o,pm(A) (dotted line). Up to A ~ 4 (•), So and Sp coincide. 
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FIG. 4: On the left: the stationary distribution n(x) for p = 3, A = 4.5 (< Ac) (and A = 4.8 inset), (3 = 15. Ahhough the 
solutions clearly differs from the paramagnetic solution (a single peak at x — 1/3), the absence of peaks near a; ~ 0, 1 indicates 
that Eo{X) — 0. On the right: the stationary distribution Tr{x) for p = 3, A = 5.5 (> Ac), /3 = 15. In the inset we have enlarged 
and truncated the vertical scale, to illustrate the continuous nature of the distribution. We note the peaks and non-trivial 
distribution at x ~ and x ~ 1, indicating that many vertices are forced (not) to take a specific color, and that Eo{X) > 0. 
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FIG. 5: The phase diagram (A,T). The phase transition from a paramagnetic distribution ti{x) to a non-paramagnetic 
distribution 7r(x) is 2nd order in 7r(x). From x on- wards the RS ground state energy is positive. 



3. exact enumerations 



To validate the results obtained analytically we carried out extensive computer simulations using two different 
approaches. 

The first numerical method we use, is an exact enumeration of all the possible colorings for a given graph. Note 
that, in general, the number of possible colorings examined grows exponentially with the system size (i.e. ~ 
P{Ny) exp(ciVt, log(p— 1)), where P{N.v) is some polynomial, and where c is some constant called the attrition rate, 
see e.g. ||l2[ and references therein. Hence, for p > 3, we are fairly limited in the accessible system sizes (i.e. 
A^^ ~ 0(10^), and may expect considerable finite size effects. 

For p = 2, however, the colorability of a graph can be determined by the following linear algorithm: We start by 
picking a vertex at random, and giving it a certain color. Then, the color of all vertices it is connected to (i.e. the 
2nd generation, which is typically a finite number that depends on A), must have the opposite color, and the edges 
involved can be removed. Now one can assign the first color to all the vertices (the 3rd generation) connected to the 
2nd generation, and the edges involved are again removed. This process is repeated until either the whole graph is 
colored or a contradiction is encountered. Since this process requires only a finite number of operations per edge, 
and since the number of edges is Nf. = jNy, one can determine the 2-colorability of the graph in linear time, and 
large system sizes are accessible. It is important to note that a graph that contains any loop of odd length, is not 
2-colorable, while any graph that does not contain a loop of odd length, is. We will use this observation to obtain an 
exact expression for the 2-colorability of random graphs in the next section. 
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The 2-colorability PcW as obtained by exact enumerations for system sizes = 10^, ..,10^, as well as the 
theoretical line (for oo), are plotted in Fig.|[ We observe that Vd^) decreases continuously from Vd^) = 1 at 

A = to Pc(A) = for A > 1. These results are in full agreement with those reported in although here we have 
studied much larger systems. They are also in agreement with the results obtained by the replica method, but the 
latter is unable to distinguish between 'Pc(A) — 1 and < 7'c(A) < 1 as in both cases the ground state energy is 0. 

One should note that this linear algorithm is specific to the graph-coloring problem with p ~ 2 and K = 2. In the 
case that p > 3 and/or K > 3, the colors of the next generation are not uniquely determined by the colors of the 
previous one. The same holds for the K-SAT problem (even with K = 2) where a clause (i.e. edge) may be satisfied 
by either of its arguments or by both. 

For p > 3 it is believed that no polynomial algorithm exists to determine the p-colorability of a graph, and we 
have to resort to the exploration of the possible colorings by building up a search tree. Since we limit ourselves to 
determining whether a graph is colorable or not, we are able to introduce some criteria to reduce the problem, thus 
avoiding enumerating the full search tree. 

A first step in the reduction is pruning: a vertex that has more available colors than vertices it is connected to, will 
always be able to satisfy all edges, irrespective of their colors. Therefore, it will not determine the colorability of the 
graph, and the vertex and all its edges can be pruned. This pruning is to be done iteratively (as the pruning of one 
vertex with its edges may render other vertices prunable) , until all remaining vertices have at least as many edges as 
available colors. 

A second step is early stopping: one starts coloring the remaining vertices, keeping track of the remaining available 
colors per vertex for all uncolored vertices. One can stop exploring the search tree when a good coloring is found. 
Alternatively, when the number of remaining available colors for a vertex becomes 0, the coloring so far will lead to 
a contradiction later on, and we can abandon this branch of the search tree altogether. One then backtracks to the 
point where a coloring was still possible. 

All this greatly reduces the actual number of colorings that have to be examined, leaving it, however, exponential in 
the system size, thus greatly limiting the accessible system size. Furthermore, since we stop as soon as we encounter 
a contradiction, we have no information on the minimum number of unsatisfied edges (i.e. the ground state energy), 
or the number of colorings that yield the minimum number of unsatisfied edges (i.e. the residual entropy). In Fig.O 
we observe that the transition from 'P{\) = 1 to V{\) = becomes increasingly sharp with the increasing system size, 
and that the curves cross at A ~ 4.7. This is typical for a 1st order transition, and puts the critical connectivity for 
the infinite system at Ac — 4.7. In this limit we expect 'P(A) to go discontinuously from 1 to 0, in accordance with 
the results presented in H. 
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FIG. 6: Left: the probability that a random graph is 2-colorable. The transition from J'c(A) > to Pc 
Right: the probability that a random graph is 3-colorable. The transition from Pc(A) > to Pc{\) = 
probabilities are obtained by exact enumerations, averaged over 10'^ runs. 



(A) = is 2nd order. 
: is 1st order. The 



^. Monte-Carlo simulations 



Since exact enumerations for p > 3 are limited to relatively small system sizes, we have also performed Monte-Carlo 
simulations with simulated annealing for the p = 3 case. The simulations have been performed for system sizes 
Ny — 1000 and iV„ = 10000 and consist of the following ingredients: 

• At each temperature we perform Monte-Carlo dynamics. Starting with a configuration c with energy E{c), we 
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change the color of a randomly chosen Cj to c'j ^ cj, obtaining the new configuration with energy E{S). Then, 
if Ai? = Eic') — E{c) < we always accept the move; otherwise we accept it with probability exp(— /? AE) < 1. 

• We then gradually lower the temperature (this is known as simulated annealing [^). If the temperature is 
reduced (cooling of the system) logarithmically slowly with the system size, one is guaranteed to find the global 
minimum cq of E{c). However, logarithmically slow cooling is not feasible due to limitations in computing time. 
Therefore, we must adopt a feasible cooling scheme. Here we have opted for a linear cooling scheme, where we 
increase /3 by small steps of fixed length d/3 — 10^'*. At each inverse temperature /3 we make C Ny Monte-Carlo 
steps, and we control the cooling rate by changing C, and try to extrapolate to 1/C ^ in order to obtain 
a prediction for infinitely slow cooling. The values of C that we have considered, are C = 0.1, 1, 10, 100. The 
values of the ground state energy as obtained by the linear cooling schemes serve as an upper bound for the 
true ground state energy. 

The simulation results are presented in Figj^. We observe that the results predict that Eq{X) starts deviating 
significantly form around A ~ 4.6 — 4.7, in agreement with the exact enumeration. The very similar values that we 
obtain for the ground state energies as obtained by the simulations for both A^^ = 10'^ and Ny = 10'*, indicate that 
the finite size effects for these sizes of systems, if noticeable, fall well within limitations of the achievable numerical 
precision due to the linear cooling scheme. The results show that Eo(X) as predicted by the RS approximation is no 
longer in agreement with the numerical evidence, thus giving an indirect indication that one may have to consider a 
more complicated ansatz for the replica symmetries. A similar underestimation of the ground state energy in the RS 
approximation has been observed in the K-SAT problem j^. In that model, however, the inconsistency of the RS 
result was signaled by an (unphysical) negative ground state energy. This problem was partially solved by considering 
a more complicated ansatz for the replica symmetry (i.e. a 1 step replica symmetry breaking ansatz - IRSB ). It 
is therefore plausible that such a IRSB calculation would also improve on the prediction of the value Ac at which 
the ground state energy ceases to be (i.e. move it closer to the true value Ac — 4.7). Such a calculation (and also 
subsequent steps in Parisi's scheme for RSB) is easy to formulate, but its evaluation is numerically rather involved. 
This analysis is beyond the scope of the current paper, but will be the subject of a forthcoming study. 
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FIG. 7: The ground state energy Eq{\) as obtained with MC-simuIations with simulated annealing for — 10^ (left) and 
Nv — 10^ (right), and different cooling rates, averaged of 100 runs. The lower curve is the estimate for infinitely slow cooling 
as obtained by a quadratic extrapolation of the values for the three smallest values of 1/C, to 1/C = 0. 



V. 2 COLOR PROBLEM: EXACT ANALYSIS 



We will now derive an exact expression for the 2-colorability of random graphs, in the infinite graph size limit, for 
A S [0,1]. As we have seen, the replica analysis correctly finds Eq{X) = 0, but is unable to predict the non-trivial 
behavior of Pc(A) as observed in the exact enumerations. We do this by identifying local configurations that give 
rise to non-colorable clusters, and by calculating the probabilities of their occurrence. One should notice that the 
non-colorable local configurations are loops of odd length: We start from the probability distribution for the number 
of edges of a given vertex: P{L), L = 0, .., oo, which is a Poisson distribution. We recall from (|^) that the probabilities 
of a/no 2-edge between two given vertices are given by 
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FIG. 8: Loops of odd lengths: 3,5,7,... all of which have a finite probability of occurring in large randomly generated graphs 
for any finite A. 



The probability of no (denoted by the symbol ^) odd loops in the graph is given by 

P(-.3, -.5, -.7, -.9, ..) = P(-.3)P(-.5|-.3)P(-.7|-.3, -.5)P(-.9|-.3, -.5, -.7).. . 



(34) 



We first evaluate the probability, that 3 randomly chosen vertices form a loop of length 3. We randomly pick 3 vertices 
which can be done in ^ ^ ways. The probability that for a given set of 3 vertices each is connected with the other 
two is given by: 



r{3) = vl = ^ 



(35) 



As long as the typical loop size is finite (compared to Ny), the correlations between the different (2fc+ l)-tuples is 
0{Ny^'^) (at least 2 new edges have to be present), and are therefore negligible. Hence, the probability that there are 
no 3-loops in the graph is given by 



P(-3) = (l-P(3)) 



-exp(-— ) 



(36) 



Now we turn to the probability that there are no 5-loops, given that there are no 3-loops. We can randomly pick 5 
vertices in ( ) ways. The probability that a given set of 5 vertices forms a loop (counting all the distinct possible 



orderings 4!/2), while there are no shorter (3-) loops in the group (5 internal edges have to be excluded), is given by: 



P(5h3) 



4! 



4! 



2 ■ e • ne 3 

Therefore, the probability that there are no 5-loops in the graph is given by 



^(-5) = (1-P(5)) 



exp(-^) . 



(37) 



(38) 



We can repeat this procedure for any odd loop length 2A: + 1, k = 1,2,3... The number of internal edges to exclude 
is given by (2fc+ 1)(2A: — 2)/2, while the number of distinct orderings of the vertices in a closed loop is given by 
(2fc-hl)!/[2(2A:+l)]. Hence, we obtain: 



\2fe+l 

P(-2fc + lh3, .., -2fc - 1) ~ V{^2k + 1) ~ exp(- ^ {2k + 1) ^ ' 
The probability of no odd loops of any length (i.e. the probability that the graph is colorable) is therefore: 

1 



00 / 00 1 2k-j-l 

n7'(-2fc + l)^exp(--5] 

fe=i 



exp — -(atanh(A) — A) 



i-xy A 



(39) 



(40) 
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^From Fig.|6|, we see that this result is in perfect agreement with that obtained by exact enumeration up to the point 
where the typical odd loop length becomes of the order of the square root the system size. This point moves to the 
right (and approaches A = 1) with increasing system size. Furthermore, since for < A < 1 the probability to have 
an odd loop is 7^ (odd) < 1, the ground state energy Eq per edge is then typically 0, as the probability to have a finite 
Eq is exponentially small in iV„: 

P{U = Eo>0)^ {riodd))f-^° - . (41) 

This observation is in perfect agreement with our results from the Monte-Carlo simulations (see Fig.|^), and are also 
confirmed by our replica analysis. 

Unfortunately, for p > 3 the basic local configurations (i.e. those including a finite number of vertices) that lead to 
non-colorability cannot be enumerated so easily. Furthermore, each of the basic non-colorable local configurations has 
a vanishingly small occurrence probability. It is their collective probability (including very large configurations that 
may consist of a finite fraction of the graph) that suddenly becomes 1 at the critical A, giving rise to the observed 1st 
order transition from colorable to non-colorable graphs. 




FIG. 9: The smallest elementary non-colorable configurations for p — 3 (left), p — 4 (right), both of which have a vanishingly 
small occurrence probability in large randomly generated graphs for any finite A. 



VI. CONCLUSIONS 

We analyzed the colorability of random graphs for finite average connectivity, an important NP-complete problem. 
The statistical physics based analysis provides typical results in the infinite system size limit, complementing results 
published in the computational complexity literature. 

The results obtained are in qualitative agreement with those reported in the literature as well as with numerical 
results we obtained from exact enumeration and Monte-Carlo based solutions. 

One apparent discrepancy, in the case of 2-color graphs, has been investigated using a probabilistic analysis that 
provided exact results for the probability of colorable random graphs in the case of two colors. The analysis also 
explains the failure of the statistical physics based analysis to detect uncolorability when this comes as a result of only 
a finite number of unsatisfiable edges, since such an analysis can identify only an extensive number of such edges. 

The current analysis is the first step in the investigation of graph colorability. Future studies include: a) Refining 
the current analysis by extending it to the case of 1-step RSB. b) Investigating graphs with mixed 2 and 3-color 
vertices; this case has been studied numerically in Q but is difficult to analyze due to the different nature of two 
analyzes, c) Studying the properties of random graphs with various restrictions. These research activities are currently 
underway. 
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APPENDIX A: TECHNICAL DETAILS OF THE REPLICA CALCULATION 



We now present the technical details of the replica calculation. We calculate the average of the n-th power of the 
partition sum: 
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(Al) 



The constraints on the are enforced by the introduction of the Lagrange multipliers such that the average of 
the replicated partition sum becomes 
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The average over all tensors D with K (taken to be 2 for now) non zero elements per row, and Lj per column j is 
given by (|lO|), where the Kronecker deltas can be expressed as 5{x,y) = /gZ^^-?'-!). We now proceed with the 
calculation of T 
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where we have used the short hand notations (p^. Step (A3) is justified, because after integration over the Zj, only 
those terms in the expansion of the exponential in which each Zj occurs exactly L times will survive, and it was 
shown jl^ that in the thermodynamic limit (N^ — s- oo), in the expansion of the logarithm all higher order terms are 
negligible compared to the first order term. In step (A4), we have made the choice (0) for . 

We have that [-^IOk ~ (Sj^i -^j) l^'^'^- ' ^^'^ thermodynamic limit, such that 



n 



27ri i 



i=0 



In order to factorize the whole expression in the j's, we introduce the order parameters 

N 
J=0 



by the introduction of the corresponding Lagrange multipliers (7| 
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Following similar steps we obtain for the denominator 
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The average of the replicated partition function hence reads 
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which can be evaluated using the saddle point method for the integration variables /°, ^j^i" and 'Zj^j'" ■ In order 
to proceed with the calculation, we must make an as sumption about the symmetry between replicas, and we use the 
replica symmetric ansatz (^3|) for the terms in (AlO) that involve the order parameters: 
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to obtain the following expression for the averaged replicated partition sum 
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where 
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We now solve the saddle point equations with respect to qo and qo, and note that the structure of the (go, 9o)-dependent 
part of the denominator is exactly the same with Ji = J2 = 1, to obtain 
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where A = P{L) L, such that all terms not depending on the Ti or in the numerator and denominator cancel: 



(Z'^) ~ exp 



Nvi-nY, UU - Alog(Xi) + - log(X2) + P{L) logd^L) 
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taken in the extremum for {/,7r,7r}. So far we have performed all calculations for general positive integer n. Taking 
lim„_,o ^"^ ^ ^\ and multiplying the result with -^^-j^, we obtain the replica symmetric free energy per edge ([l^). 

APPENDIX B: LOW TEMPERATURE LIMIT 

We will now show that even in the limit (3 — > oo, the distribution 7r(a;) remains non-trivial. In order to demonstrate 
this, we concentrate on the fixed point equation (p7[). Using two explicit examples, we show how contributions to 
7r() for extremal values of the arguments (i.e. \ ~ = = C'(cxp(— /?), x^, = C'(exp(— /?)) \ v ^ ^l) may generate 
contributions to 7r() with finite argument values (i.e. 1 — = 0(1), V /i) and vice-versa. 

• 1) First, we assume that there is a finite probability density it{x) that 1 — £^j. = C'(cxp(— /3)), such that 

Xi, = C'(exp(— \ V ^ [I . Suppose now that p = 3, and consider the term in ( p7| ) with L — \ = 3. The following 
combination of xg^s {£ — 1, 2, 3) has then a finite probability density: 
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xi,v = 0(exp(-/3)) I' 7^ M 
and, to leading order, generates a contribution to the l.h.s. of ( p7|) with x: 

l-x^^l~ '^P^~f , =0(1) , M=l,2,3, (B2) 



i.e. with finite argument values. 

• 2) Second, we assume that there is a finite probability density 7r(af) that 1 — = e ^ 1, such that x^ = 
0{e) \ f fi . Suppose now that p = 2, and consider the term in (|27| ) with L—1 = 3. The following combination of 
XfS {i = 1, 2, 3) has then a finite probability density: 



1 - = £i_i, 1 - a;2,i = £2,1, £1/2,1 = ©(e) 

X3,.^0{1) i/ = l,2 

and, to leading order, generates a contribution to the l.h.s. of (p7|) with x: 
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£l,l£2,l + 1-2:3,2 £l,l£2,l + l-2;3,2 

i.e. with more extreme values of the arguments ( 0{e'^) instead of 0{e) ). 



Hence, we have shown that extreme values will generate less extreme values and vice-versa. Since the r.h.s of 
( ^ ) contains terms with all values of i, obviously (even in the limit /3 — > cxd) we cannot explicitly keep track of the 
proliferation of distributions to different values of x, and have to resort to a numerical analysis. For each value of A, 
we have to check whether in the limit /3 — > 00 a finite probability density 7r(x) is generated for extremal values of x 
(i.e. 1 — Xfj^ = C'(exp(— /?))). If this is the case, the internal energy U will be positive, and the probability that the 
graph is colorable must be 0. 



